
    Coordination Implications of Software Coupling in Open Source Projects

    The effect of software coupling on the quality of software has been studied quite widely since the seminal paper on software modularity by Parnas [1]. However, the effect of increases in software coupling on the coordination of developers has not been researched as much. In commercial software development environments there are normally coordination mechanisms in place to manage the coordination requirements arising from software dependencies. But in the case of Open Source software, such coordination mechanisms are harder to implement, as the developers tend to rely solely on electronic means of communication. Hence, an understanding of the changing coordination requirements is essential to the management of an Open Source project. In this paper we study the effect of changes in software coupling on the coordination requirements in a case study of a popular Open Source project called JBoss.

    Robots that can adapt like animals

    As robots leave the controlled environments of factories to autonomously function in more complex, natural environments, they will have to respond to the inevitable fact that they will become damaged. However, while animals can quickly adapt to a wide variety of injuries, current robots cannot "think outside the box" to find a compensatory behavior when damaged: they are limited to their pre-specified self-sensing abilities, can diagnose only anticipated failure modes, and require a pre-programmed contingency plan for every type of potential damage, an impracticality for complex robots. Here we introduce an intelligent trial-and-error algorithm that allows robots to adapt to damage in less than two minutes, without requiring self-diagnosis or pre-specified contingency plans. Before deployment, a robot exploits a novel algorithm to create a detailed map of the space of high-performing behaviors: this map represents the robot's intuitions about what behaviors it can perform and their value. If the robot is damaged, it uses these intuitions to guide a trial-and-error learning algorithm that conducts intelligent experiments to rapidly discover a compensatory behavior that works in spite of the damage. Experiments reveal successful adaptations for a legged robot injured in five different ways, including damaged, broken, and missing legs, and for a robotic arm with joints broken in 14 different ways. This new technique will enable more robust, effective, autonomous robots, and suggests principles that animals may use to adapt to injury.
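
    A minimal Python sketch of the map-then-adapt idea described above (not the authors' actual implementation, which builds the behavior map with an evolutionary algorithm and guides adaptation with Bayesian optimization): a map of candidate behaviors with prior performance scores is built before deployment, and after simulated "damage" the robot repeatedly tries the behavior currently believed to perform best, updating its beliefs about similar behaviors after each trial. The behavior descriptor and the toy performance functions are illustrative assumptions.

        import numpy as np

        rng = np.random.default_rng(0)

        # --- Before deployment: build a map of candidate behaviors with prior scores ---
        behaviors = np.linspace(0.0, 1.0, 200)      # 1-D behavior descriptor (toy)
        prior = 1.0 - (behaviors - 0.7) ** 2        # performance predicted on the intact robot (toy)

        def damaged_performance(b):
            """Toy stand-in for testing a behavior on the damaged robot."""
            return 1.0 - (b - 0.3) ** 2 + rng.normal(0.0, 0.01)

        # --- After damage: trial-and-error guided by the map ---
        belief = prior.copy()
        best_b, best_perf = None, -np.inf
        for trial in range(20):
            i = int(np.argmax(belief))              # try the most promising behavior
            perf = damaged_performance(behaviors[i])
            if perf > best_perf:
                best_b, best_perf = behaviors[i], perf
            # Pull beliefs about similar behaviors toward the observed performance.
            kernel = np.exp(-((behaviors - behaviors[i]) ** 2) / 0.01)
            belief += kernel * (perf - belief[i])
            if best_perf > 0.9:                     # "good enough" compensatory behavior found
                break

        print(f"adapted behavior={best_b:.2f}, performance={best_perf:.2f}")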

    Understanding the Sources of Variation in Software Inspections

    In a previous experiment, we determined how various changes in three structural elements of the software inspection process (team size, and the number and sequencing of sessions) altered effectiveness and interval. Our results showed that such changes did not significantly influence the defect detection rate, but that certain combinations of changes dramatically increased the inspection interval. We also observed a large amount of unexplained variance in the data, indicating that other factors must be affecting inspection performance. The nature and extent of these other factors now have to be determined to ensure that they had not biased our earlier results. Identifying these other factors might also suggest additional ways to improve the efficiency of inspections. Acting on the hypothesis that the "inputs" into the inspection process (reviewers, authors, and code units) were significant sources of variation, we modeled their effects on inspection performance. We found that they were responsible for much more variation in defect detection than was process structure. This leads us to conclude that better defect detection techniques, not better process structures, are the key to improving inspection effectiveness. The combined effects of process inputs and process structure accounted for only a small percentage of the variance in inspection interval; therefore, other factors still remain to be identified. (Also cross-referenced as UMIACS-TR-97-22.)
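
    To make the variance-attribution idea concrete, the sketch below fits a linear model on synthetic inspection data and decomposes the variance in defect detection rate across process inputs (reviewer, code unit) and process structure (team size), in the spirit of the analysis described above. The factor names, levels, and data are invented for illustration, not taken from the study.

        import numpy as np
        import pandas as pd
        import statsmodels.api as sm
        from statsmodels.formula.api import ols

        rng = np.random.default_rng(1)
        n = 200

        # Synthetic inspections: process inputs (reviewer, code unit) and
        # process structure (team size) with a made-up defect detection rate.
        data = pd.DataFrame({
            "reviewer":  rng.choice(["r1", "r2", "r3", "r4"], size=n),
            "code_unit": rng.choice(["easy", "medium", "hard"], size=n),
            "team_size": rng.choice([1, 2, 4], size=n),
        })
        reviewer_effect = data["reviewer"].map({"r1": 0.0, "r2": 0.1, "r3": -0.05, "r4": 0.15})
        unit_effect = data["code_unit"].map({"easy": 0.2, "medium": 0.0, "hard": -0.2})
        data["detection_rate"] = 0.5 + reviewer_effect + unit_effect + rng.normal(0.0, 0.05, size=n)

        # ANOVA: how much variation in detection rate does each factor explain?
        model = ols("detection_rate ~ C(reviewer) + C(code_unit) + C(team_size)", data=data).fit()
        print(sm.stats.anova_lm(model, typ=2))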

    Sequential design of computer experiments for the estimation of a probability of failure

    This paper deals with the problem of estimating the volume of the excursion set of a function f: R^d → R above a given threshold, under a probability measure on R^d that is assumed to be known. In the industrial world, this corresponds to the problem of estimating a probability of failure of a system. When only an expensive-to-simulate model of the system is available, the budget for simulations is usually severely limited and therefore classical Monte Carlo methods ought to be avoided. One of the main contributions of this article is to derive SUR (stepwise uncertainty reduction) strategies from a Bayesian decision-theoretic formulation of the problem of estimating a probability of failure. These sequential strategies use a Gaussian process model of f and aim at performing evaluations of f as efficiently as possible to infer the value of the probability of failure. We compare these strategies to other strategies also based on a Gaussian process model for estimating a probability of failure. Comment: This is an author-generated postprint version. The published version is available at http://www.springerlink.co
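
    A minimal sketch of such a sequential strategy, assuming scikit-learn's Gaussian process regressor and a toy 1-D simulator (both illustrative choices), and using a simple misclassification-uncertainty criterion rather than the paper's actual SUR criteria: each new simulation is placed where the model is least sure whether the threshold is exceeded, and the probability of failure is then estimated from the posterior.

        import numpy as np
        from scipy.stats import norm
        from sklearn.gaussian_process import GaussianProcessRegressor
        from sklearn.gaussian_process.kernels import RBF

        rng = np.random.default_rng(2)

        def simulator(x):
            """Toy stand-in for the expensive-to-simulate system response."""
            return np.sin(3 * x) + 0.5 * x

        threshold = 1.2                             # failure: simulator(x) > threshold
        grid = rng.normal(0.0, 1.0, size=2000)      # samples from the known input distribution

        X = np.linspace(-2.0, 2.0, 5).reshape(-1, 1)  # small initial design
        y = simulator(X).ravel()

        for step in range(15):
            gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=1e-8).fit(X, y)
            mu, sigma = gp.predict(grid.reshape(-1, 1), return_std=True)
            p_exceed = norm.cdf((mu - threshold) / np.maximum(sigma, 1e-12))
            misclass = np.minimum(p_exceed, 1.0 - p_exceed)   # uncertainty about exceeding the threshold
            x_new = grid[np.argmax(misclass)]                 # simulate where classification is most uncertain
            X = np.vstack([X, [[x_new]]])
            y = np.append(y, simulator(x_new))

        gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=1e-8).fit(X, y)
        mu, sigma = gp.predict(grid.reshape(-1, 1), return_std=True)
        prob_failure = float(np.mean(norm.cdf((mu - threshold) / np.maximum(sigma, 1e-12))))
        print(f"estimated probability of failure ~ {prob_failure:.3f}")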

    Bayesian optimization for materials design

    We introduce Bayesian optimization, a technique developed for optimizing time-consuming engineering simulations and for fitting machine learning models on large datasets. Bayesian optimization guides the choice of experiments during materials design and discovery to find good material designs in as few experiments as possible. We focus on the case when materials designs are parameterized by a low-dimensional vector. Bayesian optimization is built on a statistical technique called Gaussian process regression, which allows predicting the performance of a new design based on previously tested designs. After providing a detailed introduction to Gaussian process regression, we introduce two Bayesian optimization methods: expected improvement, for design problems with noise-free evaluations; and the knowledge-gradient method, which generalizes expected improvement and may be used in design problems with noisy evaluations. Both methods are derived using a value-of-information analysis, and enjoy one-step Bayes-optimality.
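
    A small sketch of the expected-improvement loop for the noise-free case, assuming scikit-learn's Gaussian process regressor and a toy 1-D design parameter (both assumptions for illustration; the knowledge-gradient variant for noisy evaluations is not shown):

        import numpy as np
        from scipy.stats import norm
        from sklearn.gaussian_process import GaussianProcessRegressor
        from sklearn.gaussian_process.kernels import Matern

        rng = np.random.default_rng(3)

        def material_performance(x):
            """Toy stand-in for an expensive experiment on a 1-D design parameter."""
            return -(x - 0.65) ** 2 + 0.05 * np.sin(20 * x)

        candidates = np.linspace(0.0, 1.0, 500).reshape(-1, 1)
        X = rng.uniform(0.0, 1.0, size=(4, 1))      # small initial set of tested designs
        y = material_performance(X).ravel()

        for _ in range(10):
            gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=1e-6).fit(X, y)
            mu, sigma = gp.predict(candidates, return_std=True)
            best = y.max()
            # Expected improvement over the best design tested so far (noise-free evaluations).
            z = (mu - best) / np.maximum(sigma, 1e-12)
            ei = (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)
            x_next = candidates[np.argmax(ei)]      # run the next experiment where EI is largest
            X = np.vstack([X, [x_next]])
            y = np.append(y, material_performance(x_next[0]))

        print(f"best design found: x = {X[np.argmax(y)][0]:.3f}, value = {y.max():.4f}")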

    Effort estimation of FLOSS projects: A study of the Linux kernel

    This is the post-print version of the Article. The official published version can be accessed from the link below. Copyright @ 2011 Springer. Empirical research on Free/Libre/Open Source Software (FLOSS) has shown that developers tend to cluster around two main roles: “core” contributors differ from “peripheral” developers in terms of a larger number of responsibilities and a higher productivity pattern. A further, cross-cutting characterization of developers could be achieved by associating developers with “time slots”, and different patterns of activity and effort could be associated with such slots. Such analyses, if replicated, could be used not only to compare different FLOSS communities and to evaluate their stability and maturity, but also to determine, within projects, how effort is distributed in a given period, and to estimate future needs with respect to key points in the software life-cycle (e.g., major releases). This study analyses the activity patterns within the Linux kernel project, first focusing on the overall distribution of effort and activity across weeks and days, and then dividing each day into three 8-hour time slots and focusing on effort and activity around major releases. These analyses have the objective of evaluating effort, productivity and types of activity globally and around major releases. They enable a comparison of these releases and patterns of effort and activities with traditional software products and processes, and in turn the identification of company-driven projects (i.e., working mainly during office hours) among FLOSS endeavors. The results of this research show that, overall, the effort within the Linux kernel community is constant (albeit at different levels) throughout the week, signalling the need for updated estimation models, different from those used in traditional 9am–5pm, Monday-to-Friday commercial companies. It also becomes evident that the activity before a release is vastly different from that after a release, and that the changes show an increase in code complexity in specific time slots (notably in the late night hours), which will later require additional maintenance efforts.
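
    As a sketch of the kind of time-slot analysis described above (with synthetic timestamps standing in for commit dates mined from the kernel repository, and slot boundaries chosen purely for illustration), the snippet below buckets each commit into a weekday and one of three 8-hour slots and tabulates activity:

        import numpy as np
        import pandas as pd

        rng = np.random.default_rng(4)

        # Synthetic commit timestamps; in a real study these would be mined from
        # the repository history (e.g. author dates from the kernel's git log).
        start = pd.Timestamp("2011-01-01")
        commits = pd.DataFrame({
            "when": start + pd.to_timedelta(rng.integers(0, 365 * 24 * 3600, size=5000), unit="s")
        })

        commits["weekday"] = commits["when"].dt.day_name()
        commits["slot"] = pd.cut(
            commits["when"].dt.hour,
            bins=[0, 8, 16, 24],
            labels=["night (00-08)", "office (08-16)", "evening (16-24)"],
            right=False,
        )

        # Activity per weekday and 8-hour slot; a roughly flat table suggests a
        # community active around the clock rather than only during office hours.
        activity = commits.groupby(["weekday", "slot"], observed=True).size().unstack()
        print(activity)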

    Bayesian Optimization Approaches for Massively Multi-modal Problems

    The optimization of massively multi-modal functions is a challenging task, particularly for problems where the search space can lead the optimization process to local optima. While evolutionary algorithms have been extensively investigated for these optimization problems, Bayesian Optimization algorithms have not been explored to the same extent. In this paper, we study the behavior of Bayesian Optimization as part of a hybrid approach for solving several massively multi-modal functions. We use well-known benchmarks and metrics to evaluate how different variants of Bayesian Optimization deal with multi-modality. TIN2016-78365-

    Cisplatin-resistant triple-negative breast cancer subtypes: multiple mechanisms of resistance.

    BACKGROUND: Understanding the mechanisms underlying specific chemotherapeutic responses in subtypes of cancer may improve the identification of treatment strategies most likely to benefit particular patients. For example, triple-negative breast cancer (TNBC) patients have variable responses to the chemotherapeutic agent cisplatin. Understanding the basis of treatment response in cancer subtypes will lead to more informed decisions about the selection of treatment strategies. METHODS: In this study we used an integrative functional genomics approach to investigate the molecular mechanisms underlying known cisplatin-response differences among subtypes of TNBC. To identify changes in gene expression that could explain mechanisms of resistance, we examined 102 evolutionarily conserved cisplatin-associated genes, evaluating their differential expression in the cisplatin-sensitive basal-like 1 (BL1) and basal-like 2 (BL2) subtypes and the two cisplatin-resistant luminal androgen receptor (LAR) and mesenchymal (M) subtypes of TNBC. RESULTS: We found 20 genes that were differentially expressed in at least one subtype. Fifteen of the 20 genes are associated with cell death and are distributed among all TNBC subtypes. The less cisplatin-responsive LAR and M TNBC subtypes show different regulation of 13 genes compared to the more sensitive BL1 and BL2 subtypes. These 13 genes identify a variety of cisplatin-resistance mechanisms, including increased transport and detoxification of cisplatin and mis-regulation of the epithelial-to-mesenchymal transition. CONCLUSIONS: We identified gene signatures in resistant TNBC subtypes indicative of mechanisms of cisplatin resistance. Our results indicate that response to cisplatin in TNBC has a complex foundation based on the impact of treatment on distinct cellular pathways. We find that examination of expression data in the context of heterogeneous data, such as drug-gene interactions, leads to a better understanding of the mechanisms at work in cancer therapy response.
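
    As an illustration of the subtype comparison described above (not the authors' actual pipeline: the expression values, sample counts, and significance thresholds below are synthetic and chosen only for demonstration), a per-gene comparison of the cisplatin-resistant (LAR, M) against the cisplatin-sensitive (BL1, BL2) subtypes might look like:

        import numpy as np
        import pandas as pd
        from scipy.stats import ttest_ind

        rng = np.random.default_rng(5)

        # Toy expression matrix: rows = cisplatin-associated genes, columns = TNBC samples.
        genes = [f"gene_{i}" for i in range(102)]
        subtypes = ["BL1"] * 10 + ["BL2"] * 10 + ["LAR"] * 10 + ["M"] * 10
        expr = pd.DataFrame(rng.normal(8.0, 1.0, size=(len(genes), len(subtypes))),
                            index=genes,
                            columns=[f"s{i}_{s}" for i, s in enumerate(subtypes)])

        sensitive = [c for c, s in zip(expr.columns, subtypes) if s in ("BL1", "BL2")]
        resistant = [c for c, s in zip(expr.columns, subtypes) if s in ("LAR", "M")]

        # Per-gene comparison of resistant vs. sensitive subtypes.
        results = []
        for g in expr.index:
            t, p = ttest_ind(expr.loc[g, resistant], expr.loc[g, sensitive])
            # Difference of group means (values assumed to be log2-scaled already).
            log2fc = expr.loc[g, resistant].mean() - expr.loc[g, sensitive].mean()
            results.append((g, log2fc, p))
        table = pd.DataFrame(results, columns=["gene", "log2fc", "pvalue"]).set_index("gene")

        # Flag candidate resistance-associated genes (thresholds are illustrative).
        hits = table[(table["pvalue"] < 0.05) & (table["log2fc"].abs() > 1.0)]
        print(hits.sort_values("pvalue").head())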

    The Comparative Toxicogenomics Database: update 2011

    The Comparative Toxicogenomics Database (CTD) is a public resource that promotes understanding of the interactions of environmental chemicals with gene products, and their effects on human health. Biocurators at CTD manually curate a triad of chemical–gene, chemical–disease and gene–disease relationships from the literature. These core data are then integrated to construct chemical–gene–disease networks and to predict many novel relationships using different types of associated data. Since 2009, we have dramatically increased the content of CTD to 1.4 million chemical–gene–disease data points and added many features, statistical analyses and analytical tools, including GeneComps and ChemComps (to find comparable genes and chemicals that share toxicogenomic profiles), enriched Gene Ontology terms associated with chemicals, statistically ranked chemical–disease inferences, Venn diagram tools to discover overlapping and unique attributes of any set of chemicals, genes or diseases, and enhanced gene pathway data content, among other features. Together, this wealth of expanded chemical–gene–disease data continues to help users generate testable hypotheses about the molecular mechanisms of environmental diseases. CTD is freely available at http://ctd.mdibl.org
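
    A toy sketch of the integration step described above, joining curated chemical–gene and gene–disease relations into inferred chemical–disease links via shared genes. The rows below are made up, and CTD's actual inferences are additionally ranked statistically; this only illustrates the join idea.

        import pandas as pd

        # Invented curated interactions in the spirit of CTD's chemical-gene and gene-disease triad.
        chem_gene = pd.DataFrame({
            "chemical": ["cisplatin", "cisplatin", "arsenic"],
            "gene":     ["TP53",      "ERCC1",     "TP53"],
        })
        gene_disease = pd.DataFrame({
            "gene":    ["TP53",          "ERCC1",       "TP53"],
            "disease": ["breast cancer", "lung cancer", "skin lesions"],
        })

        # Integrate the two curated sets: chemicals linked to diseases via shared genes.
        inferred = (chem_gene.merge(gene_disease, on="gene")
                             .groupby(["chemical", "disease"])["gene"]
                             .agg(list)
                             .reset_index()
                             .rename(columns={"gene": "shared_genes"}))
        print(inferred)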
